Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pubsub notificaiton setup for low downtime migration #656

Merged
merged 19 commits into from
Oct 18, 2023

Conversation

darshan-sj
Copy link
Collaborator

@darshan-sj darshan-sj commented Oct 5, 2023

Creates pubsub topic, subscription and notification for each migration and for each Datashard(physical shard) in case of sharded migration.

  1. Sets up pubsub resources automatically during Stream setup
  2. Displays the resources in the prepare migration page along with Dataflow and Datastream resources.
  3. Cleansup the resources created when finish migration is clicked.
  • Tests pass
  • Appropriate changes to README are included in PR

Tested:

  1. MySQL Sharded migration with 2 physical shards and 4 logical shards with CDC.
  2. MySQL Non-sharded migration with CDC.
  3. Postgres Non-sharded migration with CDC.
  4. Without pubsub permissions initiating migration shows proper error message in UI.

Copy link
Member

@manitgupta manitgupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the flow reads off for me. We need to refactor and name methods according to what they are doing.

Each method should follow a single responsibility principle and do one thing.

We also seem to have implemented bucket handling for multiple shards. We need to think this through - for example, how does having everything in bucket affect retries for a subset of shards. Are there any throughput constraints introduced?
These detail needs to be captured in the design.

conversion/conversion.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
webv2/config.json Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 11, 2023

Codecov Report

Merging #656 (02af4db) into master (998c8fe) will decrease coverage by 0.20%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #656      +/-   ##
==========================================
- Coverage   54.62%   54.42%   -0.20%     
==========================================
  Files          65       65              
  Lines       12272    12317      +45     
==========================================
  Hits         6703     6703              
- Misses       5102     5147      +45     
  Partials      467      467              
Components Coverage Δ
backend-apis 43.93% <0.00%> (-0.19%) ⬇️
backend-library 60.21% <0.00%> (-0.21%) ⬇️
cli 0.00% <ø> (ø)
frontend ∅ <ø> (∅)
Files Coverage Δ
internal/convert.go 65.35% <ø> (ø)
sources/common/infoschema.go 0.00% <ø> (ø)
sources/dynamodb/schema.go 76.34% <0.00%> (ø)
sources/spanner/infoschema.go 5.88% <0.00%> (ø)
sources/sqlserver/infoschema.go 56.96% <0.00%> (ø)
sources/mysql/infoschema.go 70.00% <0.00%> (-2.03%) ⬇️
sources/oracle/infoschema.go 55.49% <0.00%> (-1.34%) ⬇️
sources/postgres/infoschema.go 75.80% <0.00%> (-1.49%) ⬇️
webv2/web.go 36.05% <0.00%> (-0.28%) ⬇️

internal/convert.go Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
webv2/web.go Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
streaming/streaming.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@VardhanThigle VardhanThigle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add to the read-me (or any suitable place in the documentation) that (something on the lines of below):

  1. smt will create pub/sub topics for each bucket.
  2. Users will need to give permissions to the gcs service agent (more details on the lines of reporting-changes Unfortunately we can't link to the subsection or say step-5 as the numbers could change.)
  3. [less important] The avro files would be copied to "/data/" folder within the root path specified by the user.

@darshan-sj darshan-sj requested a review from manitgupta October 12, 2023 17:47
Copy link
Member

@manitgupta manitgupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good now, thanks for the updates.

Documentation changes are missing. We need to update both the existing images and text about how the pipeline works, along with new permissions and pre-requisites required to run HB.

sources/common/infoschema.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@bharadwaj-aditya bharadwaj-aditya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@darshan-sj darshan-sj requested a review from manitgupta October 17, 2023 06:46
@darshan-sj
Copy link
Collaborator Author

Documentation changes preview -- https://darshan-sj.github.io/harbourbridge/

@darshan-sj darshan-sj merged commit 386defc into GoogleCloudPlatform:master Oct 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants